Fatigue is the most frequently cited physiological factor contributing to the occurrence of US Naval Aviation
flight  mishaps  (Naval  Safety  Center,  2006).  The  Navy  and  other  military  services  have  invested  significant 
resources  in  the  development  of  means  to  manage  and  mitigate  fatigue  in  operational  settings.  Foremost  among 
these  investments  is  the  development  of  fatigue  modeling/scheduling  tools,  the  primary  function  of  which  is  to 
inform  mission  scheduling  to  minimize  fatigue  and  improve  safety  and  operational  effectiveness.  Although 
generalized  fatigue  modeling  tools,  such  as  the  Fatigue  Avoidance  Scheduling  Tool  (FAST),  are  increasingly  used 
in  military  settings,  currently  there  is  no  established  tool  available  to  assess  an  individual  aviator’s  actual  real-time 
level  of  fatigue  or  general  physiological  readiness.  Recent  evidence  suggests  large  individual  differences  in  fatigue 
resistance  exist  (Van  Dongen,  Caldwell,  &  Caldwell,  2006;  Killgore,  Grugle,  Reichardt,  Killgore,  &  Balkin,  2009), 
pointing  to  the  need  to  supplement  general  models  of  fatigue  with  individualized  fatigue  measurement  and 
modeling.  Accordingly,  the  Naval  Safety  Center  (NSC)  has  identified  the  need  for  a  quickly-administered 
individualized  fatigue  assessment  tool  to  determine  a  pilot  or  aircrew  member’s  readiness-to-fly. 

In  response  to  this  need,  NAMRL  was  funded  by  the  Bureau  of  Medicine  and  Surgery  (BUMED)  Medical 
Development  Program  to  conduct  validation  research  of  several  cognitive  and  physiological  test  instruments  for 
their  potential  to  serve  as  individualized  fatigue  detection  tools.  The  instruments  evaluated  included  Flight  Fit,  a 
brief  (appx.  7  to  8  minute)  computer-based  cognitive  test  battery.  Flight  Fit  is  composed  of  tasks  measuring 
cognitive  abilities  crucial  for  handling  heavy  mental  work  load  and  sensitive  to  the  effects  of  fatigue  (e.g.,  time- 
estimation,  decision-making,  short-term  memory).  The  second  primary  instrument  evaluated  was  PMI  Fit  2000, 
which  measures  several  oculometric  characteristics  putatively  sensitive  to  the  effects  of  fatigue,  including,  pupil 
diameter,  pupil  constriction  amplitude  and  latency,  and  saccadic  velocity.  In  addition  to  the  two  main  instruments, 
the  Psychomotor  Vigilance  Task  (PVT),  a  gold  standard  in  detecting  fatigue;  Synthetic  Work  for  Windows 
(SynWin),  a  test  of  working  memory  and  cognitive  load;  simulated  flight  performance  with  X-Plane  9,  an 
ecologically  valid,  aviation-specific,  measure  of  vigilance;  and  the  Stanford  Sleepiness  Scale,  a  subjective 
assessment of sleepiness, were evaluated. For purposes of secondary analysis, performance was predicted using
fatigue  and  performance  modeling  software,  the  Fatigue  Avoidance  Scheduling  Tool  (FAST).  Subjects’  baseline 
sleep/wake  data  were  collected  via  actigraphy  and  entered  into  the  FAST  models. 

Fifteen  study  participants  were  observed  over  a  three  day  period.  During  days  one  and  two,  baseline  test 
performance  data  were  collected,  in  addition  to  actigraphic  data  on  participants’  sleep/wake  patterns.  Day  three 
consisted  of  a  25  hour  period  of  continual  wakefulness  (0300  hours  to  0400  hours),  during  which  test  and 
performance  data  were  collected  at  three  hour  intervals.  It  was  hypothesized  that  over  the  course  of  25  hours  of 
continual  wakefulness,  participants  would  exhibit  decrements  on  cognitive  (Flight  Fit)  and  physiological  (PMI  FIT 
2000)  measures.  Additionally,  it  was  hypothesized  that  secondary  validity  indices  would  demonstrate  concomitant 
performance  decrements  due  to  sleep  loss,  evidenced  through  performance  on  the  PVT,  SynWin,  and  X-Plane  flight 
simulator,  and  through  reports  of  subjective  sleepiness  on  the  SSS.  Although  it  was  anticipated  that  group 
performance  decrements  would  be  predicted  by  FAST  modeling,  it  was  hypothesized  that  some  measures  of  fatigue 
would  exhibit  significant  individual  differences,  and  that  the  addition  of  these  measures  would  incrementally 
improve  the  prediction  of  fatigued  task  performance  over  FAST  alone. 

Analyses  and  results  are  discussed  in  detail  in  three  Stages.  Stage  1  establishes  the  relation  of  significant 
measures  to  decrement  across  time  spent  without  sleep,  and  therefore  fatigue.  Stage  2  further  explores  significant 
Stage  1  relations  as  predictors  of  fatigue-related  performance  (PVT  lapses)  at  group  and  individual  levels.  Stage  3 
uses  results  from  Stages  1  and  2  to  inform  the  construction  of  optimal  group  scoring  algorithms  to  predict  fatigue- 
related  performance.  Aspects  of  Flight  Fit  and  PMI  Fit  2000  showed  significant  predictive  ability  across  all  three 
Stages  of  analysis,  with  individual  variability  playing  a  significant  role  when  examined  in  Stage  2.  The  findings 
suggest  that  basic  cognitive  and  physiologic  tasks  can  successfully  measure  fatigue,  and  that  both  are  necessary  for 
optimal  measurement.  Further,  scores  on  subsets  of  these  same  tasks  can  differentiate  an  individual’s  personal  level 
of  fatigue  susceptibility  above  and  beyond  the  current  industry  standard  tool.  Finally,  combining  the  individual 
diagnostic  power  of  Flight  Fit  and  PMI  Fit  2000  with  established  group  measures  such  as  FAST  elicits  greater 
predictive ability of fatigued performance than either approach alone. These results are based on analysis of raw
data  from  Flight  Fit  and  PMI  Fit  2000;  normative  scores  provided  by  the  manufacturers’  algorithms  did  not  yield 
significant  results.  Therefore,  the  current  algorithms  used  to  score  Flight  Fit  and  PMI  Fit  2000  must  be  adjusted  to 
reflect  use  in  a  Naval  Aviation  population.  With  that  adjustment,  both  Flight  Fit  and  PMI  Fit  2000  show  promise  as 
valid real-time readiness-to-fly assessment tools in Naval Aviation squadrons. Follow-on studies to address scoring
adjustment,  as  well  as  validation  in  a  wider  array  of  fatigue  conditions  (i.e.,  chronic,  cumulative  sleep  debt)  are 
discussed. 


INTRODUCTION 
The negative impact of fatigue is well known. Fatigue due to sleep loss causes slowed physiological and
cognitive reaction time, memory problems, and increased mistakes during even routine decision making (Caldwell
et al., 2009). Pilots and aircrew routinely report feeling fatigued in the cockpit (e.g., Belland & Bissell, 1994), and
on an objective level, pronounced neurological and physiological decrements have been associated with both
chronic (Van Dongen, Rogers, & Dinges, 2003) and acute (Caldwell, 2005) sleep loss. The Navy has documented
fatigue issues in relation to aviation for over 50 years (see Graybiel, Brown, & Crispell, 1943; Graybiel, Horwitz, &
Gates, 1944). Today, fatigue is the most frequently cited physiological factor contributing to the occurrence of US
Naval Aviation flight mishaps (Naval Safety Center, 2006), costing hundreds of millions of dollars in lost equipment
and the incalculable cost of lost human life.

Accordingly, the Navy and other military services have invested significant resources in the development of
means to manage and mitigate fatigue in operational settings. Mitigation techniques have largely focused on
pharmacologic countermeasures, for example, caffeine and dextroamphetamine. Pharmacologic interventions
continue to improve, increasing efficacy while decreasing adverse side effects. An excellent example of such an
intervention is the drug modafinil and its extended action formulation, armodafinil (Phillips, Arnold, Strompolis, &
Simmons, 2009). Modafinil and its variant have demonstrated many of the benefits of traditional stimulants already
used by military communities without severely affecting normal sleep patterns or appetite. Modafinil also appears to
have a lower potential for abuse than currently used stimulants (Lyons & French, 1991; Myrick, Malcolm, Taylor, &
LaRow, 2004). Even with the advancement of mitigation techniques, prevention of the fatigue state remains ideal.
Efforts at prevention are centered on fatigue management through the use of predictive modeling and scheduling
tools. These include duty hours, rotations and flight times used to inform mission scheduling to minimize fatigue
and improve safety and operational effectiveness. In a position paper recently adopted by the Aerospace Medical
Association, Caldwell and colleagues (2009) outline two major types of fatigue management technologies: 1) on-line
real time assessment, and 2) off-line fatigue prediction algorithms.

On-line, real-time assessment of fatigue focuses on continuous tracking of physiologic markers sensitive to
fatigue to calculate when an individual falls below an acceptable level of alertness. For instance, the Percentage of
Eye Closure (PERCLOS) metric assesses the frequency and duration of slow eye blinks, a behavior strongly linked
to drowsiness and sleep loss-related fatigue (Dinges, Mallis, Maislin, & Powell, 1998). The PERCLOS metric is
employed in several currently available commercial fatigue detectors that all operate similarly. Once an undesirable
level is reached in the PERCLOS metric, the operator is notified that the subject has reached an unsafe fatigue level,
usually by an alarm. The alarm is intended to reorient the individual long enough to take effective countermeasures.
Other physiologic indicators, such as eye gaze, head position, actigraphy, and EEG have been employed in real time
monitoring systems to good effect (Caldwell et al., 2009). However, the difficulty of positioning and maintaining
these systems in high tempo operational environments makes their transition to military application problematic. For
example, PERCLOS and other oculometrics rely on relatively stable head positioning for accuracy, a functionally
impossible condition in the cockpit. Further, detection of fatigue at its onset is not operationally ideal; by the time
deficits are detectable, performance is already compromised. This consideration has driven the development of
fatigue prediction algorithms, so that performance deficits may be anticipated before they occur.
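
As a rough illustration of how a PERCLOS-style detector might operate, the sketch below computes the proportion of recent samples in which the eyes are mostly closed and raises an alarm once that proportion crosses a threshold. The 80% closure criterion, the window length, and the 15% alarm level are illustrative assumptions, not values taken from this report or from any particular commercial device.

```python
from collections import deque


def perclos_alarm(closure_samples, window=600, closed_at=0.80, alarm_at=0.15):
    """Yield (perclos, alarm) for each sample once the window is full.

    closure_samples: iterable of eyelid closure fractions in [0, 1]
                     (0 = fully open, 1 = fully closed).
    window, closed_at, alarm_at: hypothetical tuning values.
    """
    recent = deque(maxlen=window)  # sliding window of "eye closed" flags
    for closure in closure_samples:
        recent.append(closure >= closed_at)
        if len(recent) == window:
            perclos = sum(recent) / window
            yield perclos, perclos >= alarm_at
```

Because the alarm only fires after closure has already accumulated over the window, a detector of this kind illustrates the limitation noted above: performance may already be compromised by the time the threshold is reached.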

The use of off-line fatigue prediction algorithms is well illustrated by the Fatigue Avoidance Scheduling
Tool (FAST), a program designed to measure, estimate, and manage performance changes induced by sleep
restriction or deprivation and time of day. The performance predictions are based on the Sleep, Activity, Fatigue,
and Task Effectiveness (SAFTE™) Model, extensive field data, and sleep deprivation studies (Hursh et al., 2004).
The output for FAST includes a prediction of performance effectiveness represented as a relative departure from
baseline functioning across the course of a day. For ease of interpretation, the performance scores given by FAST
can be equated to blood alcohol level (BAL), a metric with well-known cognitive and physiological performance
correlates. The FAST software can be used with actigraphy or data can be entered from a self-report of the
individual’s sleep/wake cycle. Performance prediction by FAST includes a few key assumptions, most notably that
all individuals have highly similar circadian rhythms and fatigue responses. While these assumptions are based on
group normative data, recent evidence suggests large individual differences in fatigue resistance exist (Van Dongen,
Baynard, Maislin & Dinges, 2004), and that these differences may be connected to aspects of basic cognitive
functioning (Killgore et al., 2009).

Inter-individual  variations  in  fatigue  response  highlight  the  need  to  supplement  general  models  of  fatigue, 
such as the SAFTE™ Model, with individualized fatigue measurement and modeling using a combination of
cognitive  and  physiological  factors.  Currently  there  is  not  an  established  tool  available  to  assess  an  individual 
aviator’s  actual  real-time  level  of  fatigue  or  general  physiological  readiness  in  this  capacity.  Accordingly,  the  Naval 
Safety  Center  (NSC)  has  identified  the  need  for  a  quickly-administered  individualized  fatigue  assessment  tool  to 
determine a pilot or aircrew member’s readiness-to-fly. The current report documents testing of two potential
instruments  to  fill  that  need,  the  Flight  Fit  cognitive  fatigue  assessment  and  the  PMI  FIT  2000  physiological  fatigue 
assessment  tools. 


METHOD 


Subjects 

Fifteen  active  duty  military  personnel  from  the  Naval  Aviation  Preflight  Indoctrination  (API)  program 
volunteered as test subjects. The study protocol was approved by the Naval Aerospace Medical Research Laboratory
Institutional  Review  Board  in  compliance  with  all  applicable  Federal  regulations  governing  the  protection  of  human 
subjects.  Descriptive  statistics  for  the  subjects  are  presented  in  Table  1. 

No  specific  groups  were  excluded.  However,  certain  factors  identified  via  a  medical  history  form  (Appendix 
B),  served  to  exclude  individual  participants,  due  to  their  potential  confounding  effects.  These  included  excessive 
alcohol  use  within  the  previous  48  hours  (>3  drinks),  greater  than  400mg  of  routine  daily  caffeine  consumption, 
habitual  use  of  tobacco  products  within  the  previous  six  months,  and  history  of  significant  medical,  neurological, 
psychiatric,  or  sleep-related  problems  (Killgore,  et  al.,  2009). 


Table 1. Descriptive Statistics

                  Age (years)      Height (in)      Weight (lbs)
                  Mean    SD       Mean    SD       Mean     SD
Male (n = 13)     24.7    2.1      71.2    3.3      186.6    20.0
Female (n = 2)    21.5    0.7      66.5    3.5      142.5    17.7
Total             24.3    2.3      70.5    3.6      180.7    24.6

Ethnicity    White    Black    Asian American    Hispanic/Latino(a)    Other
n            11       2        0                 2                     0


Fatigue  Assessments 

Flight Fit. The Flight Fit cognitive test battery is an abbreviated (7 to 8 minute) version of the CogniFit
assessment battery (full version is approximately 30 minutes) (CogniFit Inc., Yoqneam Ilit, Israel). The test
measures cognitive performance on various components of mental work load sensitive to the effects of fatigue.
Specifically, Flight Fit (FF) measures raw reaction time (FF_rawRT), visual scanning reaction time (FF_vsRT),
visual scanning accuracy (FF_vsACC), divided attention reaction time (FF_daRT), divided attention accuracy
(FF_daACC), shifting reaction time (FF_SRT), attention shifting accuracy (FF_shiftACC), focus reaction time in
the presence of distracters (FF_fdRT), and short-term memory (FF_STM) (see Appendix A for a complete listing
of variable abbreviations).

PMI Fit  2000.  The  PMI  FIT  2000  (PMI  Inc.,  Rockville,  MD)  uses  eye-tracking  and  pupillometry  to  identify 
impaired  physiological  states  due  to  fatigue  and  other  factors,  such  as  alcohol  or  drug  use.  The  test  requires  less 
than  one  minute  to  complete.  The  system  employs  an  algorithm  that  compares  an  individual’s  established  baseline 
to  present  state  on  4  variables  (i.e.,  pupil  diameter,  pupil  constriction  amplitude,  pupil  constriction  latency  & 
saccadic  velocity).  The  baseline  is  established  by  the  average  of  10  trials  taken  during  non-impaired  conditions. 

After  the  baseline  trials,  each  subsequent  trial  provides  the  user  with  scores  on  the  four  test  components  plus  a 
composite  score,  the  FIT  Index.  The  PMI  FIT  2000  has  been  used  in  multiple  fatigue  and  impairment  studies  in 
other contexts, such as motor vehicle operation, and has been demonstrated to be both reliable and valid (e.g., Russo
et al., 1999).
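
The manufacturer's scoring algorithm is not described in this report, so the following is only a minimal sketch of the general idea of comparing a trial against an individual's own baseline. It assumes each trial is a dictionary of the four measured variables and expresses the comparison as a simple percent change; the variable keys and the percent-change formula are hypothetical and are not the PMI FIT 2000 algorithm.

```python
from statistics import mean

# Hypothetical keys for the four measured variables.
VARIABLES = ("diameter", "amplitude", "latency", "saccadic_velocity")


def build_baseline(trials):
    """Average each variable over the baseline trials (the device uses 10)."""
    return {v: mean(t[v] for t in trials) for v in VARIABLES}


def percent_change(baseline, trial):
    """Percent deviation of one trial from the individual's own baseline."""
    return {v: 100.0 * (trial[v] - baseline[v]) / baseline[v] for v in VARIABLES}
```

Scoring against a personal baseline rather than a population norm is what allows a measure of this type to reflect individual differences in fatigue response.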

Psychomotor  Vigilance  Task.  The  PVT-192  (Ambulatory  Monitoring  Inc.,  Ardsley,  New  York)  is  a  brief 
vigilance  and  attention  task,  and  is  considered  the  gold  standard  instrument  for  assessment  of  the  effects  of  fatigue 
(Balkin et al., 2004). During each 10 minute trial, subjects are required to attend closely to a stimulus window and
respond  by  pressing  a  response  button.  Subjects  are  instructed  to  respond  as  quickly  as  possible.  PVT  scores  of 
interest  include  mean  reciprocal  reaction  time  of  the  slowest  10%  of  responses  (Mean  S  RRT),  and  lapses  (responses 
to  stimulus  presentations  taking  longer  than  500  ms). 
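
The two PVT scores described above can be computed directly from a list of reaction times. The sketch below is illustrative only: it assumes reaction times in milliseconds, expresses reciprocal reaction time in responses per second, and rounds the size of the slowest 10% to the nearest whole response; the PVT-192 software may handle these details differently.

```python
from statistics import mean


def pvt_scores(reaction_times_ms, lapse_threshold_ms=500.0):
    """Return (lapses, mean reciprocal RT of the slowest 10% of responses)."""
    # A lapse is any response slower than the threshold (500 ms in this report).
    lapses = sum(1 for rt in reaction_times_ms if rt > lapse_threshold_ms)
    slowest_first = sorted(reaction_times_ms, reverse=True)
    n_slowest = max(1, round(len(slowest_first) * 0.10))
    # Reciprocal RT in responses per second (1000 / ms); units are an assumption.
    mean_slowest_rrt = mean(1000.0 / rt for rt in slowest_first[:n_slowest])
    return lapses, mean_slowest_rrt
```

Note that a lower mean reciprocal value indicates slower responding, so fatigue shows up as a decrease in this score and an increase in lapses.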

SynWin.  SynWin  is  a  computer-based  test  module  that  simulates  a  work  environment  by  presenting  up  to 
four  tasks  on  the  screen  simultaneously.  These  tasks  include  versions  of  the  Sternberg  Memory  Task,  mathematical 
calculation,  gauge  monitoring,  and  auditory  vigilance.  Each  10  minute  trial  is  scored  individually,  and  combines  the 
participant’s  performance  on  all  administered  tasks  into  a  single  proficiency  score. 

Flight  Simulation  (X-Plane  9).  Simulated  flight  performance  was  measured  using  the  X-Plane  9  (Laminar 
Research)  flight  simulator.  Because  fatigue  impairs  basic  attentional  processes,  simple  tasks  which  are  subject  to 
more  reliable  measurement  were  the  focus  of  simulated  flight  performance.  Specifically,  subjects  were  given  a 
simple  flight  profile,  with  instructions  to  fly  “straight  and  level”  at  a  specified  altitude,  airspeed  and  heading  (i.e., 
2000  ft,  140  knots,  due  North).  Deviations  from  these  specified  flight  parameters  were  assessed. 
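
The report does not state how deviations from the flight profile were quantified. One common choice is root-mean-square (RMS) error against each target, sketched below under that assumption. The function and field names are hypothetical, and heading error is wrapped so that 358 degrees counts as 2 degrees off due North rather than 358.

```python
import math


def heading_error(actual_deg, target_deg):
    """Signed smallest angle from target to actual heading, in degrees."""
    diff = (actual_deg - target_deg) % 360.0
    return diff - 360.0 if diff > 180.0 else diff


def rms(values):
    """Root-mean-square of a non-empty list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def flight_deviation(samples, target_alt=2000.0, target_kts=140.0, target_hdg=0.0):
    """samples: iterable of (altitude_ft, airspeed_kts, heading_deg) tuples."""
    samples = list(samples)
    return {
        "altitude_rms_ft": rms([alt - target_alt for alt, _, _ in samples]),
        "airspeed_rms_kts": rms([kts - target_kts for _, kts, _ in samples]),
        "heading_rms_deg": rms([heading_error(h, target_hdg) for _, _, h in samples]),
    }
```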

Stanford  Sleepiness  Scale.  Subjective  sleepiness  was  assessed  with  the  Stanford  Sleepiness  Scale  (SSS), 
(Hoddes, Dement & Zarcone, 1972). The available scores for the SSS range from 1 (“Feeling active, vital, alert, or
wide awake”) to 7 (“No longer fighting sleep, sleep onset soon; having dream like thoughts”). There is also a means
to denote if the subject is “Asleep”, with the score of “X”. The SSS is a widely used, easy-to-administer
paper-and-pencil measure and has demonstrated excellent sensitivity to the effects of fatigue (Balkin et al., 2004).

Fatigue  Avoidance  Scheduling  Tool.  The  Fatigue  Avoidance  Scheduling  Tool  (FAST;  Nova  Scientific 
Corporation,  Fairborn,  OH)  is  software  designed  to  measure,  estimate  and  manage  performance  changes  induced  by 
sleep  restriction  or  deprivation  and  time  of  day.  The  primary  use  of  FAST  is  to  optimize  the  operational 
management  of  aviation  crews  and  to  design  work  schedules  and  mission-critical  events  in  a  manner  that  will  reduce 
fatigue and fatigue-induced errors. The performance predictions are based on the Sleep, Activity, Fatigue, and Task
Effectiveness (SAFTE™) Model, numerous laboratory collaborations, field data collection, and sleep deprivation
studies (Hursh et al., 2004). The output for FAST includes a prediction of performance effectiveness, which can
also be used to extrapolate a blood alcohol level (BAL). The FAST software can be used with actigraphy or data
can be entered from a self-report of the individual’s sleep/wake cycle.

Design 


The  experiment  employed  a  repeated  measures  design  to  investigate  the  effects  of  sleep  deprivation  on 
physiological  state  and  task  performance  over  time.  The  experiment  consisted  of  two  phases,  (1)  the  Practice  Phase 
and  (2)  the  Experimental/Sleep  Deprivation  Phase. 


Procedures 




Practice Phase. Up to four (4) volunteers were recruited during each week of the study. After receipt of
participants’ informed consent, the Practice Phase of the experiment began. This phase was executed Monday and
Tuesday morning and required approximately 90 minutes of participation each day. Practice Phase data were used
for each of the measures to establish performance asymptote and to mitigate practice effects during the
Experimental/Sleep Deprivation phase. Each day participants completed: 5 trials of the PMI FIT 2000, 2 trials of
Flight Fit, 2 trials of the PVT, three 10-minute trials of SynWin, one 15-minute trial of the X-Plane Flight Simulator
and  the  SSS.  Prior  to  departing  Monday  morning,  each  subject  was  outfitted  with  a  Motionlogger  Microsleep  Watch 
(Ambulatory  Monitoring,  Inc.,  Ardsley,  New  York),  which  was  used  to  monitor  sleep  and  wake  periods  while  not 
under  observation. 

Experimental/Sleep  Deprivation  Phase.  Upon  completion  of  Tuesday  morning  Practice  Phase,  subjects 
were  released  with  instructions  to  return  at  0530  Wednesday  morning.  Subjects  were  instructed  to  sleep  according 
to  their  normal  schedules,  and  to  awaken  at  0300  Wednesday,  remaining  awake  until  the  0530  report  time. 
Compliance  was  gauged  by  actigraphy.  Subjects  were  also  re-familiarized  with  the  protocol  for  the  sleep 
deprivation phase of the study. Beginning at 0600 subjects were assessed on Flight Fit, PMI FIT 2000, PVT,
SynWin, SSS and the flight simulator task once every three (3) hours, as follows: 1 trial of PMI FIT 2000, 1 trial of
Flight Fit, 1 trial of PVT, 1 administration of SSS, 1 trial of the simulated flight profile and 1 trial of SynWin. Trials
began  at  0600,  0900,  1200,  1500,  1800,  2100,  0000  (Thursday),  and  0300.  Upon  completion  of  the  final  trial, 
subjects  were  debriefed  and  driven  to  the  Bachelor  Officers’  Quarters  (BOQ)  with  instructions  to  obtain  adequate 
sleep  prior  to  check  out. 
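
The relation between the trial start times and hours of continual wakefulness can be worked out from the 0300 wake time and the three-hour spacing. The short sketch below does that arithmetic; the calendar date is an arbitrary placeholder (chosen only because it falls on a Wednesday), and the labels T1 to T8 are assumed to correspond to the eight trials in chronological order.

```python
from datetime import datetime, timedelta

WAKE = datetime(2000, 1, 5, 3, 0)         # placeholder date; 0300 Wednesday wake-up
FIRST_TRIAL = datetime(2000, 1, 5, 6, 0)  # 0600 Wednesday, first experimental trial


def trial_schedule(n_trials=8, interval_hours=3):
    """Return (label, clock time, hours awake) for each experimental trial."""
    rows = []
    for i in range(n_trials):
        start = FIRST_TRIAL + timedelta(hours=interval_hours * i)
        hours_awake = (start - WAKE).total_seconds() / 3600
        rows.append((f"T{i + 1}", start.strftime("%H%M"), hours_awake))
    return rows
```

Under these assumptions the first trial begins 3 hours after waking and the final 0300 trial begins at 24 hours of wakefulness.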


ANALYSES  AND  RESULTS 


Overview 

Three stages of data analysis were conducted in order to examine group and individual patterns of
fatigue-related performance decrements. In Stage 1, a series of Repeated Measures ANOVAs was conducted for each
criterion and predictor variable over the 8 Experimental Phase trials to determine which variables exhibited change
across time. Significant change in predictor variables across time established their sensitivity to fatigue on a group
level. Displaying change across time for criterion variables, such as PVT Lapses, is necessary in order to establish
those variables as fatigue-related, and therefore appropriate as outcome measures for Stage 2 predictive models. In
Stage 2, a series of Hierarchical Linear Models (HLMs) was conducted to predict performance decrements
associated with fatigue, and to simultaneously examine any individual differences that were not evident at the group
level analyses. Bivariate and multiple predictor models were examined. In Stage 3 we constructed several multiple
predictor General Linear Models (GLMs) from significant Stage 2 predictor variables to formulate optimum
group-based scoring algorithms for fatigue-related performance decrements using Flight Fit and PMI Fit 2000 components.


Stage  1 

Stage 1 analyses were performed using SPSS version 16.0 for Windows (SPSS Inc., Chicago, IL). A series
of  Repeated  Measures  ANOVAs  was  conducted  for  each  dependent  variable  over  the  8  Experimental  Phase  trials. 
The  0600  trial  of  the  Experimental/Sleep  Deprivation  Phase  was  established  as  baseline  performance.  A  value  of  p 
<  0.05  was  considered  statistically  significant.  The  following  section  describes  each  measure,  and  the  variables 
assessed. 
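
For readers unfamiliar with the test, the following is a minimal pure-Python sketch of a one-way repeated-measures ANOVA of the kind described here. It is not the SPSS procedure used in the study: it assumes complete data (every subject has a score on every trial), applies no sphericity correction, and returns only the F ratio and its degrees of freedom.

```python
def rm_anova(scores):
    """One-way repeated-measures ANOVA.

    scores: list of per-subject lists, one value per trial (no missing data).
    Returns (F, df_effect, df_error).
    """
    n, k = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n * k)
    trial_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subject_means = [sum(row) / k for row in scores]

    # Partition total variability into trial, subject, and residual parts.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_trials = n * sum((m - grand) ** 2 for m in trial_means)
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_error = ss_total - ss_trials - ss_subjects

    df_effect, df_error = k - 1, (k - 1) * (n - 1)
    f_stat = (ss_trials / df_effect) / (ss_error / df_error)
    return f_stat, df_effect, df_error
```

With 15 subjects and 8 trials, the uncorrected degrees of freedom are (7, 98), which matches the uncorrected entries reported in the results tables.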


Predictor  Variables 
Flight Fit. There were 9 sub-scores for each trial of Flight Fit: FF_rawRT, FF_vsRT, FF_vsACC, FF_STM,
FF_daACC, FF_SRT, FF_shiftACC, FF_fdRT, and FF_daRT. Results indicate that four Flight Fit sub-components
detected significant fatigue effects across trials: FF_rawRT, FF_STM, FF_daACC, and FF_shiftACC.
Post-hoc  analyses  revealed  significant  decreases  in  performance  on  these  four  measures  indicative  of  fatigue  effects, 
with  the  most  dramatic  detriments  occurring  during  the  last  two  assessment  periods  (0000  and  0300  hours).  ANOVA 
results  are  presented  in  Table  2,  and  mean  performance  scores  over  assessment  times  for  each  significant  sub-score 
are  presented  in  Figures  1-4.  Significant  sub-components  were  retained  in  Stage  2  analyses  to  be  evaluated  as 
predictor  variables. 


Table 2. ANOVA results for Flight Fit Sub-Scores

Sub-score       F       df               p       partial η²
FF_rawRT        2.86    (7, 98)          .009    .17
FF_fdRT ^       .82     (4.60, 64.52)    .533    .06
FF_vsACC ^      1.64    (4.36, 60.97)    .172    .11
FF_vsRT         1.71    (7, 98)          .114    .11
FF_STM          2.45    (7, 98)          .023    .15
FF_daACC        2.41    (7, 98)          .026    .15
FF_daRT ^       1.69    (3.60, 50.34)    .172    .11
FF_shiftACC     3.49    (7, 98)          .002    .20
FF_SRT          1.72    (7, 98)          .113    .11

^ Geisser-Greenhouse correction used due to violation of sphericity
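
The effect-size values in the last column of Table 2 are consistent with partial eta squared, which can be recovered from each F ratio and its degrees of freedom. A minimal sketch, assuming that interpretation of the column:

```python
def partial_eta_squared(f_stat, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)
```

For example, the FF_rawRT row (F = 2.86, df = 7 and 98) gives 20.02 / 118.02, or about .17, in agreement with the tabled value.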


Figure 1. Mean Flight Fit Raw Reaction Time (FF_rawRT) scores in
milliseconds at each test trial across time. Post-hoc analyses revealed
significant differences between T1 and T4 through T8; T5 and T1, T7, and
T8; T6 and T1, T7, and T8; T7 and T1, T3, T4, T5, and T6; T8 and T1, T3,
T4, T5, and T6. The most operationally significant differences, between T6
and T7 - T8, are noted (*).


Figure 2. Mean Flight Fit Divided Attention Accuracy (FF_daACC) scores in
percent correct at each test trial across time. Post-hoc analyses revealed
significant differences between T1 and T4 - T8; T5 and T1, T7, and T8; T6
and T1, T7, and T8; T7 and T1, T5, and T6; T8 and T1, T5, and T6. As with
FF_rawRT, the most operationally significant differences, between T6 and T7
- T8, are noted (*).

Figure 3. Mean Flight Fit Short Term Memory performance (FF_STM) in
number of successfully memorized items at each test trial across time.
Post-hoc analyses revealed significant differences between T8 and T1, T2, and T6,
indicating that FF_STM performance was relatively stable until a significant
decline at the 24-hour mark of continual wakefulness (*).


Figure 4. Mean Flight Fit Shifting Accuracy (FF_shiftACC) scores in percent
correct at each test trial across time. Post-hoc analyses revealed significant
differences between T4 and T7, T8; T8 and T2, T3, T4, T5, and T6, indicating that
FF_shiftACC performance declined significantly and steadily from the T4 time slot
until the final test trial (*).


PMI Fit 2000. There are four components of the FIT Index: pupil diameter, pupil constriction amplitude,
pupil constriction latency and saccadic velocity. Results are displayed in Table 3 and Figures 5-9. Although the FIT
Index failed to detect fatigue effects across the experimental time points, one subcomponent of the FIT Index,
saccadic  velocity,  appears  especially  sensitive  to  the  effects  of  fatigue  (Figure  9).  While  significant  effects  were 
present  for  amplitude  as  well,  examination  of  post-hoc  tests  revealed  patterns  that  do  not  suggest  that  effects  were 
associated  with  fatigue  (Figure  6).  As  a  result,  only  saccadic  velocity  was  retained  as  a  predictor  variable  in  Stage  2 
analyses. 


Table 3. ANOVA results for PMI 2000

Measure                F       df               p        partial η²
Diameter ^             .613    (3.79, 53.10)    .647     .042
Amplitude              4.93    (7, 98)          .000*    .260
Latency                1.41    (7, 98)          .211     .091
Saccadic Velocity ^    8.88    (3.24, 45.34)    .000*    .388
FIT Index ^            .774    (2.70, 37.76)    .503     .052

^ Geisser-Greenhouse correction used due to violation of sphericity
* Mean difference significant at the .05 level
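
The fractional degrees of freedom marked with ^ arise because the sphericity correction multiplies both the effect and error degrees of freedom by an estimated epsilon. As a small worked check, the sketch below recovers the epsilon implied by a corrected pair of degrees of freedom, assuming the 15 subjects and 8 trials of this study; the function name is illustrative.

```python
def gg_epsilon(df_effect_corrected, df_error_corrected, n_subjects, n_trials):
    """Recover the epsilon implied by sphericity-corrected degrees of freedom."""
    eps_effect = df_effect_corrected / (n_trials - 1)
    eps_error = df_error_corrected / ((n_trials - 1) * (n_subjects - 1))
    return eps_effect, eps_error
```

For saccadic velocity, 3.24 / 7 and 45.34 / 98 both come to roughly 0.46, so the two corrected values are mutually consistent.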


Figure 5. ... omnibus test of the effect was not significant (Table 3); therefore,
post-hoc analyses were not conducted.


Figure 6. Mean PMI Pupil Constriction Amplitude in millimeters at each
test trial across time. Post-hoc analyses revealed significant differences
between T1 and T2 - T6; T2 - T5 and T1, T7, and T8; T6 and T1; T7 and
T2, T3, T4, and T5; T8 and T2, T4, and T5. The most operationally
significant differences, between T5 and T7 - T8, are noted (*).


Figure 7. Mean PMI Pupil Diameter in millimeters at each test trial across
time. The omnibus test of the effect was not significant (Table 3); therefore,
post-hoc analyses were not conducted.

Figure 8. Mean PMI Pupil Constriction Latency in milliseconds at each
test trial across time. The omnibus test of the effect was not significant
(Table 3); therefore, post-hoc analyses were not conducted.


Figure 9. Mean PMI Saccadic Velocity (PMI SV) in millimeters per
second at each test trial across time. Post-hoc analyses revealed significant
differences between T1 and T7, T8; T2 and T6, T8; T3 and T6 - T8; T4
and T7, T8; T5 and T7, T8; T6 and T2, T3, T7, and T8; T7 and T1 - T6;
T8 and T1 - T7, indicating that PMI SV speed dropped significantly and
steadily from the T3 trial until the T7 trial. The most operationally
significant drop, from T6 to T7, is noted (*).


Criterion  Variables 
Psychomotor Vigilance Task. There were two variables of interest from the PVT: Mean S RRT and the
number of lapses per trial. Results indicate significant fatigue effects for both lapses and Mean S RRT; see Table 4 and
Figures 10 and 11 for details. Because lapses are both a gold standard in the fatigue literature and an operationally relevant
vigilance analogue, they were included as the primary criterion variable in Stage 2 and 3 analyses.
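
The two PVT variables can be illustrated with a minimal sketch. The 500 ms lapse cutoff is a common convention in the PVT literature but is an assumption here, since this section does not state the threshold used; the function names and the sample reaction times are hypothetical. Mean S RRT is computed as described in the Figure 11 caption (mean reciprocal reaction time of the slowest 10% of responses).

```python
from statistics import mean

# Assumed lapse cutoff in milliseconds; not stated in this section of the report.
LAPSE_THRESHOLD_MS = 500.0


def pvt_lapses(rts_ms, threshold_ms=LAPSE_THRESHOLD_MS):
    """Count responses slower than the lapse threshold."""
    return sum(1 for rt in rts_ms if rt > threshold_ms)


def mean_slowest_rrt(rts_ms, fraction=0.10):
    """Mean reciprocal RT (responses per second) of the slowest `fraction` of responses."""
    n_slow = max(1, round(len(rts_ms) * fraction))
    slowest = sorted(rts_ms, reverse=True)[:n_slow]
    return mean(1000.0 / rt for rt in slowest)


# Hypothetical reaction times (ms) from one PVT trial.
trial = [250, 260, 270, 280, 300, 310, 320, 400, 520, 1000]
print(pvt_lapses(trial))        # 2
print(mean_slowest_rrt(trial))  # 1.0
```

A lower Mean S RRT indicates slower worst-case responding, so both variables move in the expected direction as fatigue increases (more lapses, smaller reciprocal).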


Table 4. ANOVA results for PVT

                 F       df               p        ηp²
PVT Lapses^      6.88    (1.45, 20.28)    .009*    .329
Mean S RRT       9.36    (7, 98)          .000*    .401

^ Geisser-Greenhouse correction used due to violation of sphericity
* Mean difference significant at the .05 level
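
As a cross-check on Table 4, the final column of values is consistent with partial eta squared recovered from F and its degrees of freedom, using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below is illustrative only; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Mean S RRT row of Table 4: F(7, 98) = 9.36
print(round(partial_eta_squared(9.36, 7, 98), 3))  # 0.401

# PVT Lapses row of Table 4: F(1.45, 20.28) = 6.88
print(round(partial_eta_squared(6.88, 1.45, 20.28), 3))  # 0.33
```

The second value differs from the tabled .329 only in the third decimal, which is expected because the F ratio and corrected degrees of freedom are themselves rounded.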


Figure  10.  Mean  PVT  Lapses  at  each  test  trial  across  time.  Post-hoc 
analyses  revealed  significant  differences  between  T8  and  all  other 
trials,  indicating  a  distinct  point  at  which  group  vigilance  began  to 
fail  (*). 


Figure  11.  Mean  reciprocal  reaction  time  of  the  slowest  10%  of  responses  (Mean 
S  RRT)  for  the  PVT  at  each  test  trial  across  time.  Post-hoc  analyses  revealed 
significant  differences  between  T1  and  T8;  T2  and  T8;  T3  and  T7,  T8;  T4  and 
T7,  T8;  T5  and  T6,  T7,  and  T8;  T6  and  T5,  T8;  T7  and  T3,  T4,  T5,  and  T8;  T8 
and  T1  -  T7.  This  pattern  is  extremely  similar  to  PVT  lapses,  with  the  final  trial 
significantly  slower  than  all  other  trials  (*). 


SynWin. Significant differences across time were found for the composite scores (Table 5 and Figure 12).
However, these effects appear to reflect dramatic performance variation from assessment to assessment
rather than effects that can be clearly explained by fatigue (see Figure 12). Though a marked performance
decrement does appear from 2100 to 0300, the lack of consistency across time makes SynWin unsuitable
as an outcome measure for further analyses.


Table 5. ANOVA results for SynWin composite score

    F       df         p        ηp²
    4.70    (7, 98)    .000*    .251

* Mean difference significant at the .05 level

Figure 12. Mean SynWin Composite Score at each test trial across time. Post-hoc
analyses revealed significant differences between T1 and T2, T4, and T6; T2 and T1,
T7, and T8; T3 and T4, T6; T4 and T1, T3, T5, T7, and T8; T5 and T4, T6; T6 and T1,
T3, T5, T7, and T8; T7 and T2, T4, and T6; T8 and T2, T4, and T6. The pattern of
differences, though statistically significant, was not operationally interpretable in this
context due to lack of consistency across time.


Flight  Simulation  (X-Plane  9). 

Calculation of Total Lapse Time. Deviations from the specified flight parameter goals for heading (due North),
airspeed (140 kts), and elevation (2000 ft) were calculated separately. Lapse times were calculated for each
parameter as the number of seconds during a simulator trial that subjects deviated from the flight goal by more
than one standard deviation (determined at baseline). Total lapse time was the sum of lapse times for each
parameter. The analysis revealed dramatic and significant effects of assessment time on total lapse time, suggesting
that total lapse time is sensitive to fatigue effects. Results are displayed in Table 6 and Figure 13. These initial results
suggest that it is possible to construct an ecologically valid measure of vigilance using low-cost, commercially
available flight simulation. Though promising, these are preliminary results only; further validation across time and
varying situations is needed before Flight Simulator lapses can be used as an outcome measure with the same
confidence as PVT lapses.
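
The total lapse time calculation described above can be sketched as follows. This is a minimal illustration, not the scoring code used in the study: it assumes one simulator sample per second, represents heading as a signed deviation from North in degrees (so no wrap-around handling is needed), and uses hypothetical baseline standard deviations and sample values.

```python
def lapse_seconds(samples, goal, baseline_sd):
    """Seconds (one sample per second assumed) spent more than 1 baseline SD from the goal."""
    return sum(1 for value in samples if abs(value - goal) > baseline_sd)


def total_lapse_time(trial, goals, baseline_sds):
    """Sum of per-parameter lapse times for heading, airspeed, and altitude."""
    return sum(
        lapse_seconds(trial[name], goals[name], baseline_sds[name]) for name in goals
    )


# Flight goals from the text; heading is expressed as degrees away from due North.
goals = {"heading_deg": 0.0, "airspeed_kts": 140.0, "altitude_ft": 2000.0}

# Hypothetical baseline standard deviations for one subject.
baseline_sds = {"heading_deg": 5.0, "airspeed_kts": 4.0, "altitude_ft": 50.0}

# Hypothetical four-second excerpt of one simulator trial.
trial = {
    "heading_deg": [1.0, -6.0, 2.0, 7.0],
    "airspeed_kts": [141.0, 139.0, 146.0, 140.0],
    "altitude_ft": [2010.0, 2060.0, 1990.0, 1940.0],
}

print(total_lapse_time(trial, goals, baseline_sds))  # 5
```

Because the three parameters are scored separately and then summed, a single second in which two parameters are out of tolerance contributes two seconds to the total under this reading of the definition.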


Table 6. ANOVA results for Flight Simulator Total Lapse Time^

    F       df         p      ηp²
    2.53    (7, 98)    3)2    .15

^ Geisser-Greenhouse correction used due to violation of sphericity


Stanford  Sleepiness  Scale  (SSS).  Results  for  the  SSS  scores  show  that  there  was  a  significant  main  effect  of 
assessment  time.  Post  hoc  comparisons  showed  significant  differences  between  levels,  the  most  revealing  between 
Trials  6,  7,  and  8  and  all  other  Trials  (see  Table  7  and  Figure  14),  with  individuals  reporting  greater  sleepiness 
linearly  across  time.  However,  any  self-reported  subjective  state  has  significant  drawbacks  as  a  performance 
criterion  variable,  including  the  possible  influences  of  demand  characteristics,  variability  in  individual  interpretation 
of  the  question,  and  intentional  misreporting  or  deception. 

Table 7. ANOVA results for Stanford Sleepiness Scale

    F        df          p             ηp²
    26.30    (9, 126)    < .000001*    .653

* Mean difference significant at the .05 level




Figure  13.  Mean  Total  Lapse  Time  on  Flight  Simulator  Performance  in  seconds 
at  each  test  trial  across  time.  Post-hoc  analyses  revealed  significant  differences 
between  T8  and  all  other  trials,  indicating  a  distinct  point  at  which  group 
vigilance  began  to  fail  (*).  Notably,  this  pattern  is  highly  similar  to  PVT  lapses 
(Figure  10). 


Figure  14.  Mean  Stanford  Sleepiness  Scale  (SSS)  score  at  each  test  trial  across  time. 
Post-hoc  analyses  revealed  significant  differences  between  T6,  T7,  and  T8  and  all 
other  trials,  indicating  that  participants  felt  significantly  sleepier  as  time  awake 
increased.  The  most  operationally  significant  differences,  between  T6  and  T7  -  T8, 
are  noted  (*). 


Stage  1  Summary 

Stage 1 analyses were conducted in order to establish the sensitivity of each predictor and criterion variable
of interest to change across time, allowing inference of the relation of group averages on those variables to time
spent without sleep, and hence fatigue. These results suggest that the prospects for development of a squadron-level
tool to detect aviator fatigue state (i.e., “readiness-to-fly”) in real time are good. However, because Stage 1
analyses were predicated on a repeated measures design, they do not provide information on the causal relation
between predictor and criterion variables. Group average based analyses, such as those performed in Stage 1, also
mask any potential individual differences in fatigue response. We therefore constructed a series of predictive
Hierarchical Linear Models (HLMs) in Stage 2, using the framework provided by Stage 1 to examine any potential
individual differences in fatigue response as well as the ability of each variable to predict a criterion measure of
performance, PVT lapses. Among the available performance markers, PVT lapses were deemed most appropriate for
this application due to their extensive validation in the fatigue literature as well as their operational relevance. For
instance, vigilance of display change on the PVT response box can be likened to vigilance of display change on a
radar screen. Stage 2 analyses are introduced in more detail in the next section.


Stage  2 

Stage 1 longitudinal analyses revealed significant cognitive and physiological decrements attributable to
fatigue on a group level. When group results were visually inspected at the individual level, two distinct clusters in
the data emerged, suggesting the possibility of important individual differences in responses to fatigue. In order to
statistically examine individual variability in fatigue-related performance decrements, a series of two-level
Hierarchical Linear Models was performed. An exploratory bivariate analysis examined the ability of all significant
Stage 1 predictors to explain PVT lapses during the test day. FAST scores were included in this analysis series, for
their potential ability to predict performance on a group level as well as any possible individual differences within
that ability. The results of the bivariate analyses informed subsequent moderation and multiple predictor models.


Bivariate  HLMs 
For all bivariate HLMs, fixed (level 1 equations) and random (level 2 equations) effects of the predictors
were included, allowing determination of an overall effect of each predictor, as well as whether the relation was
consistent or varied across subjects. Significance at level 1 indicates a group effect, while significance at level 2
indicates significant individual differences within that overall effect. If the random effect was not significant,
indicating that there was no significant inter-individual variability, the model was refitted without the random effect
of the predictor in order to focus on the group effect. Any variables that exhibited a significant bivariate relation at
level 1, level 2, or both with PVT lapses are identified in Table 8. These include Time, FF_rawRT, FF_daRT,
FF_shiftACC, PMI Fit 2000 Saccadic Velocity (PMI_SV), and FAST. The nature of these effects, including graphs
of the individual slopes for each significant relation, is presented next.
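
For reference, the two-level structure described above can be written out explicitly. The block below restates the level 1 and level 2 equations listed in Table 8 in conventional notation; the subscripts t (trial) and i (subject) and the generic predictor X are notational assumptions added here for clarity, and the Greek symbols correspond to the table's B, G, U, and R terms.

```latex
% Level 1 (within subject): lapses at trial t for subject i
Y_{ti} = \beta_{0i} + \beta_{1i} X_{ti} + r_{ti}

% Level 2 (between subjects): subject-specific intercept and slope
\beta_{0i} = \gamma_{00} + u_{0i}
\beta_{1i} = \gamma_{10} + u_{1i}

% Combined form
Y_{ti} = \gamma_{00} + \gamma_{10} X_{ti} + u_{0i} + u_{1i} X_{ti} + r_{ti}

% Refitted model when the random slope is not significant
\beta_{1i} = \gamma_{10}
```

In this notation, the level 1 test concerns the group slope \gamma_{10}, while the level 2 test concerns whether the variance of the slope deviations u_{1i} differs from zero.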

Time. Level 1 and level 2 equations were significant for Time predicting PVT Lapses. The group effect
replicated  the  longitudinal  relation  of  PVT  lapses  across  time  established  in  Stage  1  analyses.  The  effect  at  level  2, 
indicating  significant  individual  differences  about  the  group  slope,  is  an  excellent  illustration  of  the  application  of 
HLM  to  these  data  and  the  importance  of  considering  individual  fatigue  responses  when  predicting  performance. 
Visual  inspection  of  Figure  15  reveals  at  least  two  distinct  groups  in  the  data  when  viewed  as  individually  plotted 
lines.  For  some  subjects,  lapses  increase  at  a  much  faster  rate  across  time  than  for  other  subjects.  Conceptualizing 
this  difference  in  terms  of  fatigue  susceptibility,  individuals  with  high  fatigue-susceptibility  can  be  identified  by 
their  steep  slopes.  Low  fatigue  susceptible  individuals  exhibit  the  opposite  trend,  with  little  to  no  change  in  PVT 
lapses  in  relation  to  Time.  The  difference  between  the  steepness  of  the  two  most  extreme  subject  slopes  and  the  rest 
of  the  group  would  not  be  evident  if  one  line  was  fitted  based  on  a  group  average.  In  this  case  performance 
decrement  would  be  under-predicted  for  individuals  who  were  actually  most  fatigue  susceptible,  and  over-predicted 
for  those  who  were  not.  Operationally  this  could  lead  to  over-utilization  of  performance  compromised  individuals 
and  under-utilization  of  mission-ready  individuals. 


Table 8. Bivariate HLM Relations with Outcome = PVT Lapses

              Level 1                                              Level 2
Variable      Equation                        t       df    p      Equations                        χ²       df   p
Time          Y = B0 + B1*(Time) + R          2.47    14    0.03   B0 = G00 + U0; B1 = G10 + U1     50.76    14   0.00
FF_rawRT      Y = B0 + B1*(FF_RAWRT) + R      2.65    118   0.01   B0 = G00 + U0; B1 = G10 + U1     11.38    14   >0.50
FF_daRT       Y = B0 + B1*(FF_DART) + R       2.10    118   0.04   B0 = G00 + U0; B1 = G10 + U1     13.97    14   >0.50
FF_shiftACC   Y = B0 + B1*(FF_shift) + R      -2.76   14    0.01   B0 = G00 + U0; B1 = G10 + U1     38.88    14   0.001
PMI_SV        Y = B0 + B1*(PMI_SV) + R        -1.99   14    0.07   B0 = G00 + U0; B1 = G10 + U1     25.36    14   0.03
FAST          Y = B0 + B1*(FAST) + R          -2.82   14    0.01   B0 = G00 + U0; B1 = G10 + U1     276.37   14   0.00

Note. PVT = Psychomotor Vigilance Task, FF = Flight Fit, rawRT = Reaction Time, daRT = Divided Attention
Reaction Time, shiftACC = Shifting Accuracy, PMI = Pulse Medical Instruments, SV = Saccadic Velocity, and FAST
= Fatigue Avoidance Scheduling Tool.


Flight Fit Raw Reaction Time (FF_rawRT). Although visual inspection of Figure 16 may appear to suggest
that there is a significant effect of the predictor at level 2, there is not enough variability among individual slopes to
constitute a significant random effect. In other words, visual variability does not translate into statistically significant
individual differences in this case. This underscores the importance of conceptualizing potential individual
differences both visually and statistically within the context of the variable
under consideration. Because the level 2 equation was not significant using a random effect, the level 1 equation
reported here consists of a re-estimation of the model without the random effect (see Table 8). The re-estimated
model was significant at level 1 for FF_rawRT predicting PVT lapses, such that as reaction time increases, PVT
lapses increase. The presence of a significant level 1 effect in the absence of a significant level 2 effect indicates that
decline in FF_rawRT is best conceptualized on a group level, with the relation between FF_rawRT and PVT lapses
tracking similarly for all subjects in this study sample.

Flight Fit Divided Attention Reaction Time (FF_daRT). As with FF_rawRT, there was no significant
random effect of FF_daRT at level 2. Re-estimation of the model without the random effect produced a significant
relation of FF_daRT to PVT lapses at level 1, such that as divided attention reaction time increases, PVT lapses
increase (Figure 17). This pattern, similar to what was observed with FF_rawRT, indicates that no statistically
significant individual differences exist in our sample in the relation between FF_daRT and PVT lapses, and that all
subjects' performance suffers similarly under fatigued conditions.

Flight Fit Shifting Accuracy (FF_shiftACC). Level 1 and level 2 equations were significant for
FF_shiftACC predicting PVT lapses. The significant level 1 relation indicates that as shifting accuracy decreases,
PVT lapses increase. Visual inspection of the significant inter-slope variability at level 2 shows that some
individuals exhibit a much broader range of both PVT lapses and FF_shiftACC scores than others, with those
exhibiting a broader range of scores displaying the greatest fatigue-related decrement as well. That is, individuals
with a relatively short plot line also tend to have flat slopes, while those with longer lines tend to have steeper
slopes. Practically, this means that individuals who show more variability in their performance also tend to perform
worse overall. The strong clustering of plot line end points in the lower right quadrant of Figure 18 also indicates
that high shifting accuracy almost always translates to a low number of PVT lapses for both high and low fatigue
susceptible individuals.
Figure 15. Individual subject slopes for PVT Lapses across trial time
(group  mean  centered  values).  There  was  a  significant  group  effect  (*) 
and  significant  individual  differences  (+)  such  that,  on  average,  lapses 
increased  as  time  spent  without  sleep  increased;  however,  the  nature  of 
that  relation  varied  significantly  from  subject  to  subject. 


Figure 16. Individual subject slopes for PVT Lapses in relation to
FF_rawRT in milliseconds (group mean centered values). There was
a significant group effect (*), but no significant differences were
observed among individual slopes, indicating that the relation
between FF_rawRT and PVT Lapses is similar for all subjects in the
sample.
Figure 17. Individual subject slopes for PVT Lapses in relation to
FF_daRT in milliseconds (group mean centered values). There was a
significant group effect (*), but no significant differences were observed
among individual slopes. As with FF_rawRT, this indicates that the
relation between FF_daRT and PVT Lapses is similar for all subjects in
the sample.


Figure 18. Individual slopes for PVT Lapses in relation to
FF_shiftACC in percent (group mean centered values). There was a
significant group effect (*) and significant individual differences (+)
such that, on average, lapses increased as shifting accuracy decreased;
however, the nature of that relation varied significantly from subject to
subject.


PMI Saccadic Velocity (PMI_SV). The level 1 equation was significant for PMI_SV such that as saccadic
velocity decreases, PVT lapses increase. There was also a significant random effect of the predictor at level 2.
Visual inspection of the significant inter-slope variability at level 2 reveals a similar pattern to FF_shiftACC, though
the dichotomy between high and low fatigue susceptibility is not as clear. Individuals who show more variability in
their performance tend to perform worse overall, though this trend is not as strongly tied to low scores in the
predictor variable as it is with FF_shiftACC. Some individuals with relatively slow saccadic velocity commit few
lapses. The presence of these individuals, who run counter to the general relation between slow saccadic velocity and
more lapses, highlights the dynamic role of performance baseline and individual variation in saccadic velocity in relation
to fatigue progression. In terms of baseline performance, individuals who start with few to no lapses tend to stay that
way; graphically these are the plot lines with low intercepts. Individuals with higher intercepts, and therefore poorer
baseline performance, tend to get worse across trials. The possible role of baseline performance on a measure as a
predictor of fatigue-related decline across time will be discussed in more detail in the section Moderation HLMs
using Baseline PVT Performance. In terms of individual variation, the fact that slow saccadic velocity can be, but
is not always, associated with a high number of lapses further emphasizes the need for establishing individual
baselines in physiological fatigue measures (Figure 19).

FAST. The level 1 equation was significant for FAST, indicating that as FAST predicts a drop in
performance, a drop in PVT vigilance occurs. There was also a significant random effect of the predictor at level 2.
As with Time, this significant inter-slope variability at level 2 reveals some distinct groups: 1) those for which
fatigue-related decrement is well predicted, 2) those for which it is over-predicted, and 3) those for which it is
under-predicted. Group 1 can be seen in the plot lines grouped around the center of Figure 20, where an incremental
change in FAST relates to a relatively equal incremental change in PVT lapses. Group 2 is represented by the flat
lines across the bottom of Figure 20, where performance decrements predicted by FAST do not materialize. Group 3
is the most striking, represented by the steeply sloped lines running distinctly separate from the other plot lines. Here,
actual performance suffers at a much greater rate than what is predicted by FAST. Operationally, the only acceptable
predictive ability is for individuals in Group 1. As previously noted, over-prediction can result in inefficient use of
manpower, and under-prediction can create a hazardous working environment.
Figure 19. Individual slopes for PVT Lapses in relation to PMI_SV in
millimeters per second (group mean centered values). There was a
significant group effect (*) and significant individual differences (+)
such that, on average, lapses increased as saccadic velocity decreased;
however, the nature of that relation varied significantly from subject to
subject.
Figure 20. Individual slopes for PVT Lapses in relation to predicted
performance  in  FAST  (group  mean  centered  values).  There  was  a 
significant  group  effect  (*)  and  significant  individual  differences  (+) 
such  that,  on  average,  a  predicted  drop  in  performance  by  FAST 
translated  into  an  increase  in  PVT  Lapses;  however,  the  nature  of  that 
relation  varied  significantly  from  subject  to  subject. 


The  Relation  between  the  SSS  and  PVT  Lapses 

The  final  bivariate  relation  examined  was  between  two  conceptual  outcome  variables,  PVT  lapses  and  the 
SSS.  While  not  defined  as  outcome  and  predictor  a  priori,  this  relation  is  theoretically  interesting  in  that  it  allows 
determination  of  whether  an  individual’s  subjective  evaluation  of  their  fatigue  state  from  the  SSS  is  predictive  of 
their  objective  fatigue-related  performance  on  the  PVT.  Level  1  and  level  2  equations  were  significant  for  the  SSS 
predicting  PVT  lapses,  indicating  that,  in  general,  subjective  sleepiness  is  predictive  of  PVT  vigilance,  but  there  are 
significant  individual  differences  in  that  general  relation  (see  Figure  21).  Visual  inspection  of  Figure  21  reveals  that 
all  subjects  report  getting  progressively  more  tired.  However,  for  the  individuals  represented  by  lines  with  flat 
slopes,  increasing  sleepiness  does  not  correlate  with  an  actual  drop  in  performance.  For  subjects  with  relatively  steep 
slopes  subjective  sleepiness  is  related  to  performance  decrements.  Again,  we  are  presented  with  low  and  high 
fatigue  susceptible  subjects.  The  operational  impact  of  this  relation  is  the  clearest  of  the  Stage  2  analyses:  asking 
someone  how  sleepy  they  are  holds  variable  diagnostic  value  in  terms  of  predicting  subsequent  performance. 
Operationally  this  emphasizes  the  importance  of  objective  fatigue  measurement,  such  as  with  the  metrics  currently 
evaluated  in  this  report. 


Figure 21. Individual slopes for PVT Lapses in relation to SSS scores
(group  mean  centered  values).  There  was  a  significant  group  effect  (*) 
and  significant  individual  differences  (+)  such  that,  on  average,  as 
subjective  sleepiness  increased,  PVT  Lapses  increased;  however,  the 
nature  of  that  relation  varied  significantly  from  subject  to  subject. 
Moderation HLMs using Baseline PVT Performance

Examining the bivariate relations as a group, the largest effects of fatigue, as well as the largest relations of
predictors explaining fatigue, appear to be among subjects who had the largest number of lapses. This trend suggests
that variability in the relations may be explained by the participants’ average baseline PVT performance from the
Practice phase. Using the bivariate analyses that had a significant random effect of the predictor, we conducted a
series of analyses in which the average number of PVT lapses during the Practice phase was included as a level 2
variable. A significant effect of Practice phase PVT lapses at level 2 would indicate a moderating influence of
baseline performance on subsequent performance when predicted by the level 1 variable. For example, prediction of
PVT lapses by FF_shiftACC while fatigued can be partially explained by an individual’s baseline PVT performance
(see Table 9), such that a higher number of baseline lapses is significantly predictive of a steeper slope representing
the relation between FF_shiftACC and PVT lapses while fatigued. The same pattern is true when baseline PVT lapses
are used as a moderator between FAST and PVT lapses while fatigued (see Table 9). Practically, this means that the
worse a person’s vigilance is when rested, the more extreme their negative reaction to fatigue will be when subjected
to sleep loss. Operationally, this means that a warfighter’s vigilance while fatigued can be predicted, at least in part,
by their vigilance while rested, above and beyond the considerable predictive ability of FF_shiftACC and FAST.
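
The moderation models can be written out to show where the moderating effect enters. The block below restates the Table 9 equations for the FF_shiftACC model in conventional notation; the subscripts t (trial) and i (subject) are notational assumptions added here, and BLAPSAVG denotes the average number of Practice phase PVT lapses, as in the table.

```latex
% Level 1: lapses at trial t for subject i
Y_{ti} = \pi_{0i} + \pi_{1i}\,\mathrm{FF\_shiftACC}_{ti} + e_{ti}

% Level 2: baseline lapses predict each subject's intercept and slope
\pi_{0i} = \beta_{00} + \beta_{01}\,\mathrm{BLAPSAVG}_{i} + r_{0i}
\pi_{1i} = \beta_{10} + \beta_{11}\,\mathrm{BLAPSAVG}_{i} + r_{1i}

% Combined form: the moderation is the cross-level interaction term
Y_{ti} = \beta_{00} + \beta_{01}\,\mathrm{BLAPSAVG}_{i}
       + \beta_{10}\,\mathrm{FF\_shiftACC}_{ti}
       + \beta_{11}\,\mathrm{BLAPSAVG}_{i}\,\mathrm{FF\_shiftACC}_{ti}
       + r_{0i} + r_{1i}\,\mathrm{FF\_shiftACC}_{ti} + e_{ti}
```

Under this formulation, a nonzero \beta_{11} means that the slope relating shifting accuracy to lapses depends on baseline lapses, which is the moderating influence described above.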


Table 9. Moderating Effects of Baseline Lapses on Performance Outcome = PVT Lapses

              Level 1                                                  Level 2
Variable      Equation                          t        df    p      Equations                              χ²       df   p
FF_shiftACC   Y = P0 + P1*(FF_shiftACC) + E     -2.06    116   0.04   P0 = B00 + B01*(BLAPSAVG) + R0         29.01    13   0.01
                                                                      P1 = B10 + B11*(BLAPSAVG) + R1
FAST          Y = B0 + B1*(FAST) + R            -3.876   116   0.00   B0 = G00 + G01*(BLAPSAVG) + U0         230.14   13   0.00
                                                                      B1 = G10 + G11*(BLAPSAVG) + U1

Note. PVT = Psychomotor Vigilance Task, FF = Flight Fit, RT = Reaction Time, shiftACC = Shifting Accuracy, and FAST =
Fatigue Avoidance Scheduling Tool.


Multivariate  HLM 

Bivariate analyses established the significant relations of six individual predictors and one outcome variable
to PVT lapses. Many of the significant predictors are conceptually related, such as FF_shiftACC and PMI_SV, and
may share statistical explanatory variance. A multivariate HLM using all significant bivariate predictors except
Time was therefore constructed. Time was excluded as it is assumed to be theoretically and statistically collinear
with the other predictors. PVT lapses were used as the outcome. The purpose of this analysis was to determine
which, if any, variables possessed unique predictive ability above and beyond the others. Results indicate that, of
FF_rawRT, FF_daRT, FF_shiftACC, PMI_SV, and FAST, only FF_daRT, FF_shiftACC, and FAST remained significant
predictors of PVT lapses at level 1, while FF_rawRT and PMI_SV dropped out. This suggests that the significant
predictive ability of FF_rawRT and PMI_SV may already be captured by aspects of FF_daRT, FF_shiftACC, and
FAST.

Stage  2  Summary 

Stage 2 analyses were conducted to establish the ability of significant Stage 1 variables to predict
performance on the PVT. HLM was used in order to observe significant relations at the group and individual levels
simultaneously. Three aspects of Flight Fit (FF_rawRT, FF_daRT, and FF_shiftACC), one of PMI Fit 2000
(PMI_SV), and FAST were able to significantly predict PVT lapses at the group level. Of these,
FF_shiftACC, PMI_SV, and FAST also displayed significant individual differences in their relation to PVT
lapses. There was also significant inter-individual variability in the relation between SSS scores and PVT lapses,
uncovering a disconnect between subjective self-report of fatigue and its objective consequences. Moderation
analyses revealed that baseline PVT lapses can be used to predict the relation between shifting accuracy and PVT
lapses, and a comprehensive multivariate HLM demonstrated that divided attention reaction time, shifting accuracy,
and FAST predict significant systematic variance in PVT lapses above and beyond raw reaction time and saccadic
velocity. These results clarify two important points for the development of an individualized readiness-to-fly fatigue
measure. First, fatigue measurement must take individual differences into account. This cannot be accomplished
using group norms as the comparison baseline for individual prediction. Though a group-based approach will give
successful approximations for most people, the statistical outliers are actually the most critical to capture when
trying to predict fatigue-related performance in an operational context. The most efficient way to capture these
outliers would be to establish individual baselines of performance and then track changes from that baseline, much in
the same way the PMI Fit 2000 does, with a focus on the significant predictors found in this stage. The moderation
results suggest that an individual’s rate of decline due to fatigue could be predicted while establishing baseline
vigilance ability. Second, the results of the multivariate HLM demonstrate the need to carefully balance predictive
power and practical application. While raw reaction time and saccadic velocity do not explain systematic variance in
PVT lapses above and beyond the other significant predictors, they are the fastest and least obtrusive of
the tests performed. To further inform the balance between predictive power and practical application, Stage 3
consisted of an incremental validity analysis to evaluate the predictive ability of different conceptual combinations
of variables with respect to operational utility.
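
The idea of tracking change from an individual baseline, rather than from group norms, can be sketched as follows. This is a hypothetical illustration only: the scoring rule (a z-score against the individual's own rested scores), the cutoff of -2.0, and the sample values are assumptions and do not describe the PMI Fit 2000 or any algorithm evaluated in this report.

```python
from statistics import mean, stdev


def baseline_profile(practice_scores):
    """Mean and SD of one individual's rested (practice) scores; SD must be nonzero."""
    return mean(practice_scores), stdev(practice_scores)


def deviation_from_baseline(score, profile):
    """Standardized distance of a new score from the individual's own baseline."""
    base_mean, base_sd = profile
    return (score - base_mean) / base_sd


def flag_decline(score, profile, z_cutoff=-2.0):
    """True when the score falls at or below the (hypothetical) cutoff."""
    return deviation_from_baseline(score, profile) <= z_cutoff


# Hypothetical shifting accuracy scores (percent) for one individual while rested.
profile = baseline_profile([90, 92, 94, 92, 92])

print(flag_decline(91, profile))  # False
print(flag_decline(88, profile))  # True
```

Because each person is compared only with their own rested performance, an individual whose normal scores are unusually low or high is not misclassified simply for differing from the group mean.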


Stage  3 

The  multivariate  HLM  from  Stage  2,  including  group  and  individual  difference  effects,  is  difficult  to 
translate  into  a  single  fatigue  prediction  algorithm.  Ideally,  accurate  prediction  would  be  based  on  an  individual 
equation  for  each  subject  in  which  the  respective  beta  weights  for  each  variable  change  according  to  inter-individual 
slope  variability.  The  individualized  algorithm  approach  is  beyond  the  scope  of  this  report,  though  the  significant 
level  1  equations  from  the  Stage  2  multivariate  HLM  are  further  examined  here.  Using  level  1  results  from  Stage  2 
analyses,  a  group-based  scoring  algorithm  was  constructed.  It  is  important  to  note  that  this  approach  has  the  same 
strengths and weaknesses as other currently used group-based prediction algorithms. However, its conceptual use
for  this  report  is  not  necessarily  in  creating  a  mission-ready  fatigue  prediction  algorithm;  rather,  it  is  in  examining 
the  interaction  and  respective  contribution  of  cognitive  and  physiological  measures  of  fatigue  in  an  incremental 
fashion  in  a  single  equation.  To  further  explore  the  incremental  validity  of  each  significant  Stage  2  predictor  at  the 
group  level,  a  series  of  enter-method  General  Linear  Models  (GLMs)  was  constructed.  Because  an  established 
group-based  scoring  algorithm  is  already  included  in  this  report  (FAST),  the  first  analysis  compared  variance  in 
PVT  lapses  explained  by  FAST  alone  with  total  variance  explained  when  significant  Flight  Fit  and  PMI  predictors 
were  included  with  FAST.  Results  are  presented  in  Table  10.  As  in  the  Stage  2  analysis,  FAST  was  able  to  explain  a 
significant  amount  of  variance  in  PVT  lapses,  about  14%  (R-square  =  .138),  at  the  group  level.  Addition  of  raw 
reaction  time,  divided  attention  reaction  time,  shifting  accuracy,  and  saccadic  velocity  increased  the  amount  of 
variance  explained  to  about  36%  (R-square  =  .357),  a  significant  change  statistically  and  conceptually. 
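
As a rough check on the incremental validity figures, the change-in-F statistic for nested regression models can be recomputed from the R-square values and degrees of freedom reported in Table 10. The sketch below assumes the standard F-change formula for nested linear models. The function name f_change is illustrative and is not taken from the original analysis, and the inputs are the rounded values printed in the table, so small discrepancies from the reported statistics are expected.

```python
def f_change(r2_reduced, r2_full, df1, df2):
    """F statistic for the gain in R-square when df1 predictors are added.

    r2_reduced: R-square of the smaller (nested) model
    r2_full:    R-square of the larger model
    df1:        degrees of freedom for the change, as listed in Table 10
    df2:        residual degrees of freedom of the larger model
    """
    return ((r2_full - r2_reduced) / df1) / ((1.0 - r2_full) / df2)


# Values as reported in Table 10 (R-square rounded to three decimals).
f_model1 = f_change(0.0, 0.138, df1=1, df2=118)    # FAST alone vs. no predictors
f_model2 = f_change(0.138, 0.357, df1=4, df2=114)  # Flight Fit and PMI predictors added

# Additional share of PVT-lapse variance explained beyond FAST alone.
r2_gain = 0.357 - 0.138

print(round(f_model1, 2), round(f_model2, 2), round(r2_gain, 3))
```

Under these assumptions the recomputed statistics come out near 18.89 and 9.71, in line with the reported 18.863 and 9.688, and the gain in explained variance is about 22 percentage points.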

Table 10. Incremental Ability of FAST, Flight Fit Subscores, and PMI FIT 2000 Subscores to
Predict Variance in PVT Lapses

Model  Variables    Equation                        R Square  ΔF      df1  df2  p
1      FAST         PVT Lapses = FAST * -.371       .138      18.863  1    118  .000
2      FAST         PVT Lapses = (FAST * -.126) +   .357      9.688   4    114  .000
       FF rawRT     (FF rawRT * .029) +
       FF daRT      (FF daRT * .03) +
       FF shiftACC  (FF shiftACC * -.424) +
       PMI SV       (PMI SV * -.211)

Note. FAST = Fatigue Avoidance Scheduling Tool; FF rawRT = Flight Fit raw reaction time; FF daRT = Flight Fit
Divided Attention Reaction Time; FF shiftACC = Flight Fit shifting accuracy; PMI SV = PMI Saccadic Velocity.
All equation values assumed to be group mean centered.
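
To make the Model 2 equation in Table 10 concrete, the following minimal sketch applies the tabled weights to one set of group-mean-centered scores. The weights are copied from the table. The function names, the dictionary keys, the group means, and the example scores are all hypothetical, and the units of each predictor are not specified in this section, so the output is illustrative only and is not a validated fatigue score.

```python
# Weights copied from Table 10, Model 2.
MODEL2_WEIGHTS = {
    "FAST": -0.126,
    "FF_rawRT": 0.029,
    "FF_daRT": 0.03,
    "FF_shiftACC": -0.424,
    "PMI_SV": -0.211,
}


def center(observed, group_means):
    """Subtract the group mean from each predictor (group mean centering)."""
    return {name: observed[name] - group_means[name] for name in MODEL2_WEIGHTS}


def predict_pvt_lapses(centered):
    """Weighted sum of centered predictors; the result is relative to the group mean."""
    return sum(MODEL2_WEIGHTS[name] * centered[name] for name in MODEL2_WEIGHTS)


# Hypothetical group means and one hypothetical observation (placeholder units).
group_means = {"FAST": 85.0, "FF_rawRT": 300.0, "FF_daRT": 450.0,
               "FF_shiftACC": 95.0, "PMI_SV": 70.0}
observed = dict(group_means, FAST=75.0)  # only FAST differs: 10 points below the mean

predicted = predict_pvt_lapses(center(observed, group_means))
print(predicted)  # positive: more lapses than the group average are predicted
```

Because the FAST weight is negative, a FAST score below the group mean raises the predicted number of lapses, which matches the direction of the relationship reported in the table.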


GENERAL  DISCUSSION 


The  results  of  the  current  investigation  underscore  four  main  points:  fatigue  measurement  and  prediction 
must  take  individual  differences  into  account,  optimal  fatigue  measurement  requires  considering  objective  cognitive 
and physiological aspects, operational utility is promising for aspects of both Flight Fit and PMI FIT 2000, and
additional  research  is  needed  to  establish  potential  for  implementation  of  these  test  instruments. 

Individual  Differences  in  Measuring  and  Predicting  Fatigue 

The  results  strongly  suggest  that  individual  differences  in  fatigue  susceptibility  must  be  taken  into  account  when 
measuring  and  predicting  fatigue.  Individual  differences  are  central  to  conceptualizing  fatigue  in  operational 
contexts  where  understanding  a  service  member’s  strengths  and  weaknesses  is  key  to  optimizing  task  assignment 
and  safety.  Other  fatigue  researchers  have  posited  that  fatigue  susceptibility  is  a  trait-like  characteristic  (Caldwell, 
2005), which has systematic, identifiable neurobiological and physical underpinnings (Caldwell et al., 2005; Retey
et al., 2006; Killgore et al., 2009) that may be modified through training (Klingberg, Forssberg, & Westerberg,
2002). Though performance prediction based on a group average accounts for most individuals, those who are not
properly  categorized  under  such  an  approach  are  theoretically  and  practically  the  most  critical  to  capture.  For 
instance,  those  who  are  highly  susceptible  to  fatigue  may  require  additional  training,  or  more  tailored  scheduling, 
similar  to  physical  readiness  training.  Those  who  are  highly  resistant  to  fatigue  may  be  better  suited  to  situations  in 
which  sustained  vigilance  is  routinely  required,  such  as  in  Air  Traffic  Control  (ATC)  -  much  like  assigning  certain 
duties  according  to  physical  strength. 

Practically,  these  results  suggest  that  quantifying  individual  differences  can  be  easily  accomplished  by 
establishing individualized baselines of performance, as illustrated by the PMI FIT 2000. By taking 10 rested
baseline  readings  prior  to  sleep  deprivation,  the  PMI  system  was  able  to  calculate  each  subject’s  performance  in 
terms  of  a  departure  from  that  subject’s  own  record,  no  matter  where  that  record  began  and  no  matter  the  rate  at 
which  that  departure  proceeded. 
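
One simple way to express a reading as a departure from a subject's own baseline is a within-subject standardized score. The sketch below is an assumption for illustration and is not the PMI FIT 2000's proprietary scoring method. The function name and all readings are hypothetical.

```python
from statistics import mean, stdev


def baseline_departure(baseline_readings, new_reading):
    """Return how far new_reading sits from the subject's own rested baseline,
    in units of that subject's baseline standard deviation."""
    center = mean(baseline_readings)
    spread = stdev(baseline_readings)  # sample standard deviation
    if spread == 0:
        raise ValueError("baseline readings show no variability")
    return (new_reading - center) / spread


# Ten hypothetical rested readings for each of two subjects (arbitrary units).
subject_a = [100, 102, 98, 101, 99, 100, 103, 97, 100, 100]  # baseline mean 100
subject_b = [90, 92, 88, 91, 89, 90, 93, 87, 90, 90]         # baseline mean 90

# The same raw reading means different things for different subjects.
departure_a = baseline_departure(subject_a, 93)  # well below subject A's own record
departure_b = baseline_departure(subject_b, 93)  # above subject B's own record
print(round(departure_a, 2), round(departure_b, 2))
```

The same raw reading of 93 is a large drop for the first subject and a slight improvement for the second, which is why a score referenced to the individual's own record can flag a change that a group-referenced score would miss.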

Measuring  and  Predicting  Fatigue  with  Objective  Cognitive  and  Physiological  Aspects 

Cognitive (e.g., tests of vigilance, working memory) and physiological (e.g., fMRI, EEG) measurements have
been used previously to track fatigue-related changes, and previous results verify that they are similarly affected by
sleep deprivation (Berka et al., 2007). Biomathematical models of sleep deprivation and performance emphasize that
cognitive and physiological aspects of fatigue are numerous, interdependent, and complex (Dinges, 2004; Retey et
al., 2006). The results from the current study confirm that the predictive ability of an already biomathematically
based model, FAST, was significantly improved by adding individualized cognitive (Flight Fit) and
physiological  (PMI  FIT  2000)  measures.  Operationally,  the  measures  tested  were  fast,  effective,  and  adaptable  to 
fatigue  vulnerability  across  and  within  individuals.  Future  individualized  fatigue  detection  tools  should  incorporate 
individualized  cognitive  and  physiological  measurements  to  maximize  predictive  ability  and  successful 
categorization. 

Results  also  suggest  that  these  individualized  cognitive  and  physiological  measurements  should  be  as  objective 
as  possible.  The  design  included  the  SSS  as  a  subjective,  self-report  measure  of  fatigue  to  observe  the  relation 
between an individual’s feelings of fatigue and their actual performance while fatigued. While the SSS does not include a
self-rating of performance or performance potential per se, and therefore excludes an analysis of an individual’s
ability  to  perceive  or  predict  their  actual  performance,  it  does  quantify  an  individual’s  perception  of  their  general 
fatigue  state.  Results  reveal  a  closely  clustered  group  average  for  SSS  across  time,  such  that  all  subjects  reported 
getting  more  tired  as  time  awake  increased.  However,  the  individual  differences  in  performance  remain,  meaning 
that fatigue-resistant individuals still get sleepy - they just continue to perform at baseline levels despite their
sleepiness.  These  results  suggest  that,  simply  put,  asking  an  individual  whether  they  are  too  tired  to  perform  has 
little  diagnostic  value  for  actual  performance,  especially  for  the  individuals  who  would  most  likely  do  well.  This 
point  further  emphasizes  the  need  for  multi-dimensional,  objective  evaluation  and  prediction  of  performance  while 
fatigued. 

Usability  and  Recommendations 

Aspects of Flight Fit and PMI FIT 2000 are promising for use as valid real-time readiness-to-fly assessment
tools  in  Naval  Aviation  squadrons,  but  key  adjustments  need  to  be  made  to  the  manufacturers’  current  scoring 
algorithms.  For  both  instruments,  the  manufacturers’  current  scoring  algorithms  are  inadequate  for  fatigue  detection 
in  a  Naval  Aviator  population.  In  order  to  detect  significant  results,  analyses  had  to  be  performed  using  raw  scores 
from  both  tests. 

Flight  Fit.  Initial  analyses  for  Flight  Fit,  presented  in  an  interim  report  to  the  sponsor,  were  based  on  the 
program’s  standardized  output.  This  output  was  presented  to  subjects  at  the  end  of  each  testing  session  as  a 
percentile rank of performance in relation to established norms (e.g., 90th percentile for shifting accuracy).
Unfortunately,  the  Flight  Fit  test  battery  was  normed  by  the  manufacturer  using  a  sample  of  truck  drivers,  not  Naval 
Aviators. Observed percentile rank scores in the present Naval Aviator sample clustered near the high end of the
truck-driver-derived normative scores. This restriction in the range of observed normed scores left no
measurable fatigue effects. When raw scores were used, as in the analyses presented in this report, highly significant
fatigue  effects  were  exposed  on  multiple  component  subtests  of  Flight  Fit.  Before  Flight  Fit  or  any  test  battery  can 
be utilized as a readiness-to-fly screener for Naval Aviators, new scoring norms based on Naval Aviator
performance  for  significant  component  subtests  must  be  established. 
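
The range restriction described above can be illustrated with a toy calculation. The sketch below converts raw reaction times to percentile ranks against a norm sample that is slower overall than the person being tested. All numbers are invented for illustration and are not Flight Fit norms, and the percentile function is a generic one rather than the manufacturer's scoring method.

```python
from bisect import bisect_right


def percentile_rank(norm_rts, rt):
    """Percent of the norm sample with a slower (larger) reaction time than rt."""
    ordered = sorted(norm_rts)
    slower = len(ordered) - bisect_right(ordered, rt)
    return 100.0 * slower / len(ordered)


# Hypothetical norm sample: reaction times from 350 ms to 740 ms in 10 ms steps.
norm_sample = list(range(350, 750, 10))

# Hypothetical tested individual whose raw reaction time slows by 40 ms with fatigue.
rested_rt = 300
fatigued_rt = 340

rested_pct = percentile_rank(norm_sample, rested_rt)
fatigued_pct = percentile_rank(norm_sample, fatigued_rt)
print(rested_pct, fatigued_pct)   # both sit at the ceiling of the norm scale
print(fatigued_rt - rested_rt)    # the raw scores still show the slowing
```

In this toy case the raw scores register the 40 ms slowing, while both percentile ranks remain pinned at the top of the norm scale, so the normed score cannot show the change.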

PMI  FIT  2000.  For  PMI  FIT  2000,  the  failure  to  predict  fatigue  using  the  manufacturer’s  scoring  algorithm 
may  be  due  to  deviation  from  the  instrument’s  primary  intended  use:  detection  of  impairment  due  to  drugs  and 
alcohol.  Analysis  of  the  raw  data  revealed  saccadic  velocity  to  be  especially  sensitive  to  fatigue  effects.  However, 
the instrument’s other pupillometric indices and overall “FIT Index”, which are known to be sensitive to the effects
of drugs and alcohol, were unrelated to fatigue in our sample. Since the start of this study, the manufacturer has
developed an alternate setting, called Fatigue Analyzer mode, that is based heavily on saccadic velocity. Future studies
are planned using the Fatigue Analyzer mode to further validate the use of the PMI FIT 2000 as a readiness-to-fly
screener  in  Naval  Aviation. 

Though  NAMRL  cannot  recommend  use  of  Flight  Fit  and  the  PMI  FIT  2000  instruments  in  their  current 
form,  additional  testing  designed  to  re-norm  Flight  Fit  with  a  Naval  Aviation  population  using  only  the  significant 
component  subtests  reported  here,  and  further  testing  with  the  PMI  FIT  2000  in  Fatigue  Analyzer  mode,  are  both 
highly  recommended. 

Next Steps / Future Directions

The  current  results  suggest  that  aspects  of  Flight  Fit  and  PMI  FIT  2000  warrant  further  investigation  in  order 
to  better  determine  their  usefulness  to  the  Fleet  as  individualized,  readiness-to-fly  screeners.  Beyond  the  need  to 
adjust  or  modify  the  scoring  algorithms  currently  provided  by  the  manufacturers,  there  are  additional  questions  to 
examine  in  regard  to  the  operational  utility  of  these  tools.  For  instance,  sleep  deprivation  is  not  always  encountered 
in  a  sustained,  acute  manner.  Gradual,  chronic  sleep  restriction,  in  which  a  service  member  may  only  get  a  few  hours 
of  sleep  a  night  during  intensive  training  or  in  high-tempo  operations,  must  also  be  considered. 

Summary 

Over the course of 25 hours of continual wakefulness in a laboratory setting, oculometric measures of
saccadic  velocity  and  cognitive  performance  on  a  multi-faceted  test  battery  were  significantly  sensitive  to  fatigue. 
More  importantly,  these  tools  were  able  to  identify  an  individual’s  susceptibility  to  performance  decrements 
associated  with  fatigue,  a  capability  not  available  with  tools  based  on  average  group  performance.  While  these 
results  are  promising,  further  evaluation  across  a  wider  array  of  individuals,  settings,  and  fatigue  durations  is  needed 
prior  to  military  implementation.  The  ultimate  goal  is  a  comprehensive  and  field-expedient  tool  for  transition  to  the 
Fleet,  capable  of  providing  an  accurate  assessment  of  specific  fatigue  states  at  the  level  of  the  individual  warfighter. 
This  would  reduce  the  negative  impact  of  fatigue  on  performance  and  inform  a  commander’s  decision-making  on 
manning  and  mission  readiness. 


APPENDIX A. List of Abbreviations


FF_rawRT - Flight Fit Reaction Time
FF_vsRT - Flight Fit Visual Scanning Reaction Time
FF_vsACC - Flight Fit Visual Scanning Accuracy
FF_daRT - Flight Fit Divided Attention Reaction Time
FF_daACC - Flight Fit Divided Attention Accuracy
FF_SRT - Flight Fit Shifting Reaction Time
FF_shiftACC - Flight Fit Shifting Accuracy
FF_fdRT - Flight Fit Focus Reaction Time in the Presence of Distracters
FF_STM - Flight Fit Short Term Memory
PMI_SV - PMI FIT 2000 Saccadic Velocity


APPENDIX  B.  Confidential  Medical  Questionnaire 

Screening Number: _  Participant Number: _ (for office use only)  Date:

Gender (check one): Male □  Female □  Age: _  Height: _  Weight: _


Directions: Circle “Yes” or “No”. These questions are being asked to ensure your safety in this study.

Do you have a current flight physical?  Yes  No

Have you ever been diagnosed with any significant medical problems? (e.g., heart/circulatory disease)  Yes  No

Have you ever been diagnosed with any neurological syndrome, disorder, or injury? (e.g., migraines, epilepsy, or
traumatic brain injury)  Yes  No

Have you ever been diagnosed with any psychiatric disorder? (e.g., depression or anxiety)  Yes  No

Have you ever been diagnosed with any sleep related disorders? (e.g., sleep apnea, insomnia, narcolepsy, sleep
walking)  Yes  No

Have you used any tobacco products in the last 6 months?  Yes  No
If yes, please list quantity, frequency and type of product:

Do you take any prescribed medication on a regular basis?  Yes  No
If yes, please list:

Have you consumed any caffeine within the past 48 hours?  Yes  No
If yes, how much? _
Is this your normal amount? _

Have you consumed any alcohol within the past 48 hours?  Yes  No


Indicate all medication you have used in the past 24 hours.
(circle all that apply)

a. None                          d. Antihistamines
b. Sedatives/Tranquilizers       e. Decongestants
c. Aspirin/Tylenol/any analgesic f. Other (please specify)


How many hours did you sleep last night?

Was this amount sufficient?  Yes  No

Females: Are you currently pregnant or lactating?  Yes  No